Group comparisons in logit and probit using predicted probabilities¹
Abstract
The comparison of groups in regression models for binary outcomes is complicated by an identification problem inherent in these models. Traditional tests of the equality of coefficients across groups confound the magnitude of the regression coefficients with residual variation. If the amount of residual variation differs between groups, the test can lead to incorrect conclusions (Allison 1999). Allison proposes a test for the equality of regression coefficients that removes the effect of group differences in residual variation by adding the assumption that the regression coefficients for some variables are identical across groups. In practice, a researcher is unlikely to have either empirical or theoretical justification for this assumption, in which case Allison's test can also lead to incorrect conclusions. An alternative approach, suggested here, uses predicted probabilities. Since predicted probabilities are unaffected by residual variation, tests of the equality of predicted probabilities across groups can be used for group comparisons without assuming that the regression coefficients of some variables are equal. Using predicted probabilities requires researchers to think differently about comparing groups. With tests of the equality of regression coefficients, a single test lets the researcher conclude easily whether the effects of a variable are equal across groups. Testing the equality of predicted probabilities requires multiple tests, since group differences in predictions vary with the levels of the variables in the model. A researcher must therefore examine group differences in predictions at multiple levels of the variables, which often leads to more complex conclusions about how groups differ in the effect of a variable.

¹ I thank Paul Allison, Ken Bollen, Rafe Stolzenberg, Pravin Trivedi, and Rich Williams for their comments.
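The identification problem described in the abstract can be illustrated with a small numerical sketch. It assumes the usual latent-variable formulation of the logit model, in which only the structural coefficients divided by the residual scale are identified. The parameter values, group labels, and function names below are hypothetical and chosen purely for illustration. This is not the paper's testing procedure, which would also require estimated coefficients and their standard errors.

```python
import math


def logistic_cdf(z):
    """Standard logistic CDF, the link function of the logit model."""
    return 1.0 / (1.0 + math.exp(-z))


def identified_coefs(alpha, beta, sigma):
    """Coefficients a logit fit can recover: structural values over the residual scale."""
    return alpha / sigma, beta / sigma


def predicted_prob(alpha, beta, sigma, x):
    """Pr(y = 1 | x) under the latent model y* = alpha + beta*x + sigma*e."""
    a, b = identified_coefs(alpha, beta, sigma)
    return logistic_cdf(a + b * x)


# Hypothetical groups: identical structural slope, different residual scale.
group_a = dict(alpha=-1.0, beta=1.0, sigma=1.0)
group_b = dict(alpha=-1.0, beta=1.0, sigma=2.0)

# The identified slopes differ even though the structural slopes are equal,
# so a test of coefficient equality would confound slope with residual scale.
slope_a = identified_coefs(**group_a)[1]
slope_b = identified_coefs(**group_b)[1]

# Multiplying alpha, beta, and sigma by one constant leaves every predicted
# probability unchanged: probabilities depend only on the identified ratios.
rescaled_b = {name: 3.0 * value for name, value in group_b.items()}

# The group difference in predicted probability changes with the level of x,
# which is why it has to be examined at several values rather than once.
x_levels = [-2.0, 0.0, 1.0, 2.0, 4.0]
prob_gaps = [
    predicted_prob(**group_a, x=x) - predicted_prob(**group_b, x=x)
    for x in x_levels
]

print(f"identified slopes: group A = {slope_a}, group B = {slope_b}")
for x, gap in zip(x_levels, prob_gaps):
    print(f"x = {x:5.1f}  Pr_A - Pr_B = {gap:+.4f}")
```

In this sketch the gap between the groups changes sign across the range of x, so no single number summarizes how the groups differ. That is the sense in which comparisons based on predicted probabilities call for tests at multiple levels of the variables.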
Publication date: 2009